SEQUENTIAL MONITORING OF RESPONSE-ADAPTIVE
RANDOMIZED CLINICAL TRIALS
University of Virginia

Clinical trials are complex and usually involve multiple objectives,
such as controlling the type I error rate, increasing the power to
detect a treatment difference, and assigning more patients to the
better treatment. In the literature, both response-adaptive
randomization (RAR) procedures (which change the randomization
procedure sequentially) and sequential monitoring (which changes the
analysis procedure sequentially) have been proposed to achieve these
objectives to some degree. In this paper, we propose to sequentially
monitor response-adaptive randomized clinical trials and study the
properties of the resulting procedure. We prove that the sequential
test statistics of the new procedure converge in distribution to a
Brownian motion. Further, we show that the sequential test statistics
asymptotically satisfy the canonical joint distribution defined in
Jennison and Turnbull (2000). Therefore, type I error control and other
objectives can be achieved theoretically by selecting appropriate
boundaries. These results open the door to sequentially monitoring
response-adaptive randomized clinical trials in practice. Simulation
studies also show that the proposed procedure brings together the
advantages of both techniques with respect to power, total sample size
and total number of failures, while preserving the type I error rate.
In addition, we illustrate the characteristics of the proposed
procedure by redesigning a well-known clinical trial of maternal-infant
HIV transmission.

1. Introduction. Clinical trials usually involve multiple competing
objectives, such as maximizing the power to detect a clinical
difference among treatments, minimizing the total sample size and
protecting more people from possibly inferior treatments. To achieve
these objectives, two different techniques have been proposed in the
literature: (i) the analysis approach, which analyzes the observed data
sequentially [sequential monitoring, Jennison and Turnbull (2000)], and
(ii) the design approach, which changes the allocation probability
sequentially [response-adaptive randomization, Hu and Rosenberger
(2006)]. In this paper, we discuss how to combine the two procedures
in one clinical trial in order to exploit the advantages of both.

In experiments where data accumulate sequentially, it is natural to
conduct a sequential analysis. Sequential techniques have a long
history and rest on a methodology based on Brownian motion. Wald's
classic work on the sequential probability ratio test (SPRT) [Wald
(1947)] led to the application of sequential analysis in numerous
fields of statistics. Armitage (1957, 1975) introduced sequential
methods to clinical studies, which required monitoring results on a
patient-by-patient basis. Pocock (1977) proposed sequential monitoring
of clinical trials on a group basis. Since then, many authors have done
important work on group sequential studies. This work is summarized in
Jennison and Turnbull (2000) and Proschan, Lan and Wittes (2006).

The main advantages of sequential monitoring are listed in Jennison
and Turnbull (2000). First, it is ethical to monitor clinical trials
sequentially, because doing so ensures that patients are not exposed to
dangerous treatments and allows a trial to be stopped as soon as
possible if needed. Second, administratively, one needs to ensure that
the protocol is not violated and that the assumptions on which the
clinical trial is based are correct and valid. Third, sequential
monitoring can decrease sample size and cost. With these advantages,
sequential monitoring has become a standard technique in conducting
clinical trials.

The idea of response-adaptive randomization (RAR) can be traced back
to Thompson (1933) and Robbins (1952). The play-the-winner rule [Zelen
(1969)] and the randomized play-the-winner rule [Wei and Durham (1978)]
were proposed to reduce the number of patients assigned to the inferior
treatment. Hu and Rosenberger (2003) proved theoretically that adaptive
randomization can be used to increase statistical efficiency in some
clinical trials. Many papers in the literature have shown its
efficiency and ethical advantages over fixed designs [Hu and
Rosenberger (2006)]. With modern technology and a high capacity for
collecting data, it is becoming easier to implement adaptive designs in
sequential experiments. Some clinical trials have already implemented
response-adaptive designs [Rout et al. (1993), Tamura et al. (1994),
Andersen (1996), etc.].

Bayesian adaptive designs have also been proposed and studied in the
literature. Berry (2005) provided a comprehensive introduction to
Bayesian designs in clinical trials. Recently, Cheng and Shen (2005)
proposed to sequentially monitor a Bayesian adaptive design using
decision-theoretic approaches, allowing the maximum sample size to be
sequentially adjusted by the observed data. Lewis, Lipsky and Berry
(2007) proposed a Bayesian decision-theoretic group sequential design
for a disease with two possible outcomes based on a quadratic loss
function. Wathen and Thall (2008) studied Bayesian adaptive model
selection for optimizing group sequential clinical trials. In this
paper, we focus on sequential monitoring of response-adaptive
randomized clinical trials.

Traditionally, sequential monitoring deals with fixed designs (usually
with equal allocation). No systematic study is available on sequential
monitoring of a sequential experiment that uses response-adaptive
randomization, except for a simulation study by Coad and Rosenberger
(1999). They found that the expected number of treatment failures can
be further reduced by combining the triangular test with the randomized
play-the-winner rule. In this paper, we study both the theoretical
properties and the finite sample properties of combining sequential
monitoring with response-adaptive randomization.

Sequential monitoring procedures use responses to stop or continue a
clinical trial. Response-adaptive randomization procedures sequentially
estimate the parameters and update the allocation probability of the
next patient. To monitor a response-adaptive randomized clinical trial
sequentially, one needs to study the two sequential procedures
simultaneously. This is conceptually difficult because: (1) the number
of patients assigned to each treatment is a random variable at each
time point; (2) both the treatment assignments (probabilities) and the
estimators of the parameters (test statistics) depend on the responses
at each time point. These problems arise from the sequential updating
of the parameter estimators and of the allocation probability function,
which makes it difficult to find the joint distribution of the
sequential test statistics. We overcome these difficulties by (i)
approximating these different processes by martingale processes at each
time point simultaneously; and (ii) then using a continuous Gaussian
approximation to study these martingale processes simultaneously.

In this paper, we discuss sequential monitoring of the doubly adaptive
biased coin design proposed by Hu and Zhang (2004) for comparing two
treatments. Under widely satisfied conditions, we show that the
sequential test statistics converge in distribution to (i) a standard
Brownian motion under the null hypothesis; and (ii) a drifted Brownian
motion under the alternative hypothesis. For a standard Brownian
motion, the critical values for a fixed type I error rate have been
well studied in the literature. Therefore, the problem of controlling
the type I error is theoretically solved. Further, we show that the
sequential test statistics asymptotically satisfy the canonical joint
distribution defined in Jennison and Turnbull (2000). Hence, one can
apply the group sequential methods in that book to response-adaptive
randomized clinical trials.

Simulation results support our theoretical findings in terms of type I
error and show that sequential monitoring of a response-adaptive
randomization procedure can increase power and decrease the total
number of failures. Also, compared with complete randomization,
sequential monitoring of a response-adaptive randomization procedure
can stop earlier and thus reduce the actual sample size. In other
words, the proposed procedure achieves the goals of both RAR and
sequential monitoring. We also redesign an experiment, performed by
Connor et al. (1994), that evaluated the effect of zidovudine treatment
in reducing the risk of maternal-infant HIV transmission. The proposed
procedure can be used to decrease the number of HIV-infected people and
to increase the power compared with complete randomization.

In Section 2, we introduce the notation, describe the framework and state 
the main theorem. In Sections 3 and 4, we use both generated data and real 
data to compare the proposed procedure with other randomization proce- 
dures. Conclusions are in Section 5 and technical proofs are given in the 
Appendix. 

2. Sequential monitoring of response-adaptive randomization procedures. 

2.1. Notation and framework. We first describe the framework for the
randomized adaptive designs. In this article, we consider clinical
trials with two treatments, 1 and 2. Let $T_i = (T_{i1}, T_{i2}) =
(1,0)$, $i = 1,\ldots,n$, if the $i$th patient is assigned to treatment
1, and $(0,1)$ otherwise, where $n$ is the sample size. Let $N(n) =
(N_1(n), N_2(n))$, where $N_j(n) = \sum_{i=1}^{n} T_{ij}$ is the number
of patients in treatment $j$. Let $X = (X_1,\ldots,X_n)'$, where $X_i =
(X_{i1}, X_{i2})$, $i = 1,\ldots,n$, is a random matrix of response
variables and $X_{ij}$, $j = 1,2$, are $d$-dimensional random vectors.
Here, only one element of $X_i$, say $X_{ij}$, can be observed if the
$i$th patient is assigned to treatment $j$. We assume that
$X_1,\ldots,X_n$ are independent and identically distributed with
unknown parameter $\theta = (\theta_1,\theta_2)$, where $\theta_j$ is
the corresponding $d_j$-dimensional parameter vector
$(\theta_{j1},\ldots,\theta_{jd_j})$ of treatment $j$ ($j = 1,2$). To
simplify the notation, we assume that the parameter vectors of both
treatments have the same dimension ($d_1 = d_2 = d$). Without loss of
generality, we also assume that $\theta_j = E(X_{1j})$. Otherwise, we
can transform $X$ and treat the transformation as the responses to make
this equation hold, if such a transformation exists. Such a
transformation usually exists asymptotically. See Gwise, Hu and Hu
(2008) and Hu and Zhang (2004) for further discussion.

Let $[nt]$ denote the largest integer that is smaller than or equal to
$nt$ for $t \in [0,1]$. Then $N([nt]) = (N_1([nt]), N_2([nt]))$ and
$N_j([nt]) = \sum_{i=1}^{[nt]} T_{ij}$, $j = 1,2$. Note that $t = N/n$
when $N$ is the number of patients who have already been enrolled. We
introduce the so-called information time $t$ in order to formulate this
problem in the Skorohod topology [Ethier and Kurtz (1986)]. After
$N = [nt]$ patients have been assigned and the responses observed, we
use the modified sample means $\hat\theta_{[nt]} =
(\hat\theta_{[nt],1}, \hat\theta_{[nt],2})$ to estimate the parameter
$\theta = (\theta_1, \theta_2)$, that is,

$$ \hat\theta_{[nt],1} = \frac{\sum_{i=1}^{[nt]} T_{i1} X_{i1} + \theta_{0,1}}{N_1([nt]) + 1}, \qquad \hat\theta_{[nt],2} = \frac{\sum_{i=1}^{[nt]} T_{i2} X_{i2} + \theta_{0,2}}{N_2([nt]) + 1}. \tag{2.1} $$

Here, we add 1 to the denominator to prevent discontinuity, and add
$\theta_{0,j}$, say 0.5, to estimate $\theta_j$ when no patient has
been assigned to treatment $j$, $j = 1,2$.
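As an illustration of the estimator in (2.1), the following is a minimal Python sketch for scalar responses. The function name `modified_means` and its argument layout are hypothetical, chosen only for this example; the default starting values of 0.5 follow the suggestion in the text.

```python
from typing import Sequence, Tuple


def modified_means(assignments: Sequence[int],
                   responses: Sequence[float],
                   theta0: Tuple[float, float] = (0.5, 0.5)) -> Tuple[float, float]:
    """Modified sample means in the spirit of (2.1), for scalar responses.

    assignments[i] is 1 or 2 (the treatment given to patient i) and
    responses[i] is the response observed on that treatment.
    theta0 holds the starting guesses theta_{0,1} and theta_{0,2}.
    """
    sums = [0.0, 0.0]   # sum of observed responses on each treatment
    counts = [0, 0]     # N_1([nt]) and N_2([nt])
    for arm, x in zip(assignments, responses):
        sums[arm - 1] += x
        counts[arm - 1] += 1
    # theta0 in the numerator and +1 in the denominator keep the estimate
    # defined even when a treatment has received no patients yet.
    return ((sums[0] + theta0[0]) / (counts[0] + 1),
            (sums[1] + theta0[1]) / (counts[1] + 1))
```

With no patients enrolled the function simply returns the starting values, which is the situation the added terms in (2.1) are meant to cover.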

Let $\rho = (\rho_1, \rho_2)$ be the target allocation proportion.
Usually $\rho$ is obtained from some optimality criterion and depends
on the unknown parameter $\theta$. The selection of $\rho =
\rho(\theta)$ has been studied by Hayre (1979), Jennison and Turnbull
(2000) and Tymofyeyev, Rosenberger and Hu (2007). In practice, the
parameters are unknown. Therefore, we first have to estimate them from
previous treatment assignments and responses so that we can target the
allocation proportion. We consider a general family of doubly adaptive
biased coin designs (DBCD) [Eisele and Woodroofe (1995)] here.

Doubly adaptive biased coin design: (i) assign the first $2n_0$
patients to treatments 1 and 2 by some restricted randomization
procedure [permuted block or truncated binomial randomization; see
Rosenberger and Lachin (2002)]; (ii) when the $l$th ($l > 2n_0$)
patient arrives and all the responses of the previous $l - 1$ patients
are available, compute $\hat\theta_{l-1}$ and $\hat\rho_{l-1} =
\rho(\hat\theta_{l-1})$; (iii) then assign the $l$th patient to
treatment 1 with probability

$$ g\bigl(N_1(l-1)/(l-1),\ \rho_1(\hat\theta_{l-1})\bigr), $$

where $g(s,r) : [0,1] \times [0,1] \to [0,1]$ is the allocation
function. Hu and Zhang (2004) proposed ($\gamma \ge 0$):

$$ g^{(\gamma)}(0,r) = 1, \qquad g^{(\gamma)}(1,r) = 0, \qquad g^{(\gamma)}(s,r) = \frac{r(r/s)^{\gamma}}{r(r/s)^{\gamma} + (1-r)\bigl((1-r)/(1-s)\bigr)^{\gamma}}. \tag{2.2} $$



The design has drawn much attention since it was proposed and its advan- 
tages and properties can be found in Hu and Rosenberger (2003), Rosen- 
berger and Hu (2004) and Tymofyeyev, Rosenberger and Hu (2007). 
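The allocation function (2.2) can be sketched in a few lines of Python. This is only an illustration under the definitions above; the name `allocation_prob` is hypothetical, `s` stands for the current proportion of patients on treatment 1 and `r` for the estimated target proportion.

```python
def allocation_prob(s: float, r: float, gamma: float = 2.0) -> float:
    """Hu-Zhang allocation function g^(gamma)(s, r) as written in (2.2).

    s: current proportion of patients on treatment 1, N_1(l-1)/(l-1)
    r: estimated target proportion for treatment 1
    gamma: non-negative tuning parameter
    """
    if s <= 0.0:      # nobody on treatment 1 yet: assign to treatment 1
        return 1.0
    if s >= 1.0:      # everybody on treatment 1: assign to treatment 2
        return 0.0
    num = r * (r / s) ** gamma
    den = num + (1.0 - r) * ((1.0 - r) / (1.0 - s)) ** gamma
    return num / den
```

When the current proportion exceeds the target (`s > r`), the returned probability falls below `r`, which pulls the allocation back toward the target; with `gamma = 0` the function returns `r` regardless of `s`.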

To compare two treatments in clinical trials, one considers a general
hypothesis test:

$$ H_0 : h(\theta_1) = h(\theta_2) \quad \text{versus} \quad H_1 : h(\theta_1) \ne h(\theta_2), $$

where $h$ is an $\mathbb{R}^d \to \mathbb{R}$ function of the
parameters. In this paper, we assume $h(\theta_j)$ is continuous and
twice differentiable in a small neighborhood of $\theta_j$, $j = 1,2$.
If one would like to test the above hypothesis at time point
$t \in (0,1]$, it is natural to construct the test statistic as

$$ Z_t = \frac{h(\hat\theta_1([nt])) - h(\hat\theta_2([nt]))}{\sqrt{\widehat{\mathrm{Var}}(h(\hat\theta_1([nt]))) + \widehat{\mathrm{Var}}(h(\hat\theta_2([nt])))}}. \tag{2.3} $$

Here, $\widehat{\mathrm{Var}}(h(\hat\theta_1([nt])))$ and
$\widehat{\mathrm{Var}}(h(\hat\theta_2([nt])))$ are some consistent
estimators of the variances of $h(\hat\theta_1([nt]))$ and
$h(\hat\theta_2([nt]))$, respectively. There is no covariance term in
the denominator since the two terms in the numerator are asymptotically
independent [Hu, Rosenberger and Zhang (2006)]. Without loss of
generality, we also assume that for some functions $v_1$ and $v_2$,

$$ [nt]\,\widehat{\mathrm{Var}}(h(\hat\theta_j([nt]))) = v_j\Bigl(\frac{N([nt])}{[nt]}, \hat\theta([nt])\Bigr)(1 + o(1)) \quad \text{a.s.}, \quad j = 1,2. $$

It is easy to see that both $v_j(y,z)$ and $Z_t(y,z)$ are
$\mathbb{R}^{2+2d} \to \mathbb{R}$ functions, where $y$ is a
two-dimensional vector and $z$ is a $2d$-dimensional vector. Examples
of using this formulation are discussed in Section 2.3.

2.2. Main results. Based on the notation in Section 2.1, we observe the
random processes $(T_1,\ldots,T_{[nt]})$, $(X_1,\ldots,X_{[nt]})$,
$N([nt])$, $\hat\theta_{[nt]}$, $\rho(\hat\theta_{[nt]})$ and $Z_t$ at
time point $t$. When a response-adaptive randomization procedure is
used, these random processes have the following characteristics, which
differ from those in fixed designs:

(1) The allocation $N([nt])$ at any time $t$ is a random vector rather
than a constant as in fixed designs.

(2) The allocation $N([nt])$ and $(T_1,\ldots,T_{[nt]})$ are not
independent of the responses $(X_1,\ldots,X_{[nt]})$ and the parameter
estimator vector $\hat\theta_{[nt]}$.

(3) The elements $\hat\theta_{1[nt]}$ and $\hat\theta_{2[nt]}$ depend
on each other at any given time $t \in (0,1]$.

These differences directly lead to difficulties in deriving the joint
distributions of the sequential test statistics.

To sequentially monitor a clinical trial, we need to figure out how to
control the type I error. The answer to this question relies on the
derivation of the asymptotic joint distribution of the sequential
statistics and the right choice of boundaries. Before we give the main
theorem, we need the following conditions on the response $X$, the
target allocation $\rho(\theta)$, the allocation function $g$ and the
functions $v_j(y,z)$, $j = 1,2$.

(A1) For some $\epsilon > 0$, $E\|X_1\|^{2+\epsilon} < \infty$;

(A2) $g(s,r)$ is jointly continuous and twice differentiable at
$(\rho_1,\rho_1)$;

(A3) $g(r,r) = r$ for all $r \in (0,1)$ and $g(s,r)$ is strictly
decreasing in $s$ and strictly increasing in $r$ on
$(0,1) \times (0,1)$;

(A4) $\rho(z)$ is a continuous function and twice continuously
differentiable in a small neighborhood of $\theta$;

(A5) $v_j(y,z)$ is jointly continuous and twice differentiable in a
small neighborhood of $(\rho,\theta)$;

(A6) $Z_t(y,z)$ is a continuous function and is twice continuously
differentiable in a small neighborhood of the vector $(\rho,\theta)$.

Remark 2.1. All the conditions are widely satisfied. An example of a
design which satisfies these conditions is the DBCD in Hu and Zhang
(2004). Condition (A1) is used to ensure the consistency of the
procedure and the asymptotic normality of the allocation proportions.
Condition (A3) forces the actual allocation proportion to approach the
theoretically targeted one. Conditions (A4), (A5) and (A6) are
satisfied in all the examples in Chapter 5 of Hu and Rosenberger
(2006).

Theorem 2.1. Let $B_t = \sqrt{t}\,Z_t$ in the space $D_{[0,1]}$ with
the Skorohod topology. Assume conditions (A1)-(A6) are satisfied. Then
we have the following two results:

(i) Under $H_0$, $B_t$ is asymptotically a standard Brownian motion in
distribution.

(ii) Under $H_1$, $B_t - \sqrt{n}\,\mu t$ is asymptotically a standard
Brownian motion in distribution, where

$$ \mu = \frac{h(\theta_1) - h(\theta_2)}{\sqrt{v_1(\rho,\theta) + v_2(\rho,\theta)}}. $$

Based on Theorem 2.1, we can obtain the asymptotic distribution of the
sequence of test statistics $\{Z_{t_1},\ldots,Z_{t_K}\}$, where
$0 < t_1 < t_2 < \cdots < t_K \le 1$. Because $Z_{t_i} =
(\sqrt{t_i})^{-1} B_{t_i}$, we have asymptotically:

(i) $\{Z_{t_1},\ldots,Z_{t_K}\}$ is multivariate normal;

(ii) $E Z_{t_i} = \mu\sqrt{n t_i}$; and

(iii) $\mathrm{Cov}(Z_{t_i}, Z_{t_j}) = \sqrt{[nt_i]/[nt_j]}$,
$0 < t_i \le t_j \le 1$.

Therefore, the sequence of test statistics
$\{Z_{t_1},\ldots,Z_{t_K}\}$ asymptotically has the canonical joint
distribution defined in Jennison and Turnbull (2000).
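To make properties (ii) and (iii) concrete, the sketch below computes the asymptotic mean vector and covariance matrix of the test statistics at a given set of looks. The function name `canonical_moments` is hypothetical, and the small offset inside `floor` is only a guard against floating-point rounding when forming $[nt]$.

```python
from math import floor, sqrt
from typing import List, Sequence, Tuple


def canonical_moments(n: int, times: Sequence[float],
                      mu: float = 0.0) -> Tuple[List[float], List[List[float]]]:
    """Asymptotic mean and covariance of (Z_{t_1}, ..., Z_{t_K}).

    Mean: mu * sqrt(n * t_i).  Covariance: sqrt([n t_i] / [n t_j]) for
    t_i <= t_j, so every diagonal entry equals 1.
    """
    counts = [floor(n * t + 1e-9) for t in times]      # [n t_i]
    means = [mu * sqrt(n * t) for t in times]
    cov = [[sqrt(min(a, b) / max(a, b)) for b in counts] for a in counts]
    return means, cov
```

For example, with $n = 500$ and looks at $t = 0.2, 0.5, 1$, the patient counts are 100, 250 and 500, and the covariance between the first and second statistics is $\sqrt{100/250}$.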

Remark 2.2. Based on the canonical joint distribution of the sequence
of test statistics $\{Z_{t_1},\ldots,Z_{t_K}\}$, we can see that the
doubly adaptive biased coin design has a simple form of information
time, which is just the proportion of the sample size enrolled. This is
because, asymptotically, the DBCD consistently allocates the same
proportion of patients to each treatment from the beginning to the end.
We conjecture that this simple form of information time holds for most
response-adaptive randomization procedures.

Based on Theorem 2.1, we can easily choose the correct critical values
for the asymptotic Brownian process, so that inflation of the type I
error is avoided. Moreover, we can make use of all the well-known
properties of the Brownian process to further analyze the sequential
monitoring of a response-adaptive randomization procedure. Because
$\{Z_{t_1},\ldots,Z_{t_K}\}$ satisfies the canonical joint distribution
asymptotically, we can apply the sequential techniques in Chapters 2-7
of Jennison and Turnbull (2000) to response-adaptive randomized
clinical trials. We may also apply different types of spending
functions to monitor a response-adaptive randomized clinical trial
sequentially. Here, we use the spending functions proposed by Lan and
DeMets (1983).

Any increasing function $\alpha(t)$ defined on $[0,1]$ with
$\alpha(0) = 0$ and $\alpha(1) = \alpha$ is called an $\alpha$ spending
function. We spend $\alpha(t_i) - \alpha(t_{i-1})$ of the total type I
error rate at time point $t_i$, so that $\alpha(t_i)$ has been spent
after this point. For times $t_i$, $i = 1, 2, \ldots$, we can
sequentially obtain the boundaries. This method requires neither a
predetermined number of looks nor equally spaced looks. We can perform
interim monitoring at any time during the trial. Such a procedure is
usually preferred by Data and Safety Monitoring Boards (DSMB).
Proschan, Lan and Wittes (2006) provided three special spending
functions. The first one approximates the O'Brien-Fleming boundaries
[O'Brien and Fleming (1979)]:

$$ \alpha_1(t) = 2\{1 - \Phi(z_{\alpha/2}/t^{1/2})\}. $$

The second one is the linear spending function:

$$ \alpha_2(t) = \alpha t. $$

The third one approximates the Pocock boundaries [Pocock (1982)]:

$$ \alpha_3(t) = \alpha \ln\{1 + (e - 1)t\}. $$

The O'Brien-Fleming-like function spends little of the type I error at
early looks. Consequently, the boundary for the last look is very close
to what it would have been without sequential monitoring. Conversely,
the Pocock-like function rejects the null hypothesis more easily, with
smaller boundaries at early looks, and then has to use a reasonably
large critical value at the end to preserve the type I error. The
linear function lies between these two. Therefore, the three functions
above represent three typical types of spending function. Finally, it
is worth mentioning that these three spending functions correspond to
the process $Z_t$.
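The three spending functions can be written down directly. The sketch below implements them as they appear in the text, using only the Python standard library; all function names are hypothetical. The helper `first_look_bound` converts the error spent at the first look into a two-sided critical value, which is exact only for the first look (later boundaries require the joint distribution and are not computed here).

```python
from math import e, log, sqrt
from statistics import NormalDist
from typing import Callable, List, Sequence

_STD_NORMAL = NormalDist()


def obf_like(t: float, alpha: float = 0.05) -> float:
    """O'Brien-Fleming-like spending: 2{1 - Phi(z_{alpha/2} / sqrt(t))}."""
    if t <= 0.0:
        return 0.0
    z = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    return 2.0 * (1.0 - _STD_NORMAL.cdf(z / sqrt(t)))


def linear(t: float, alpha: float = 0.05) -> float:
    """Linear spending: alpha * t."""
    return alpha * t


def pocock_like(t: float, alpha: float = 0.05) -> float:
    """Pocock-like spending: alpha * ln{1 + (e - 1) t}."""
    return alpha * log(1.0 + (e - 1.0) * t)


def increments(spend: Callable[[float, float], float],
               times: Sequence[float], alpha: float = 0.05) -> List[float]:
    """Error spent at each look: alpha(t_i) - alpha(t_{i-1}), with alpha(t_0) = 0."""
    spent = [spend(t, alpha) for t in times]
    return [b - a for a, b in zip([0.0] + spent, spent)]


def first_look_bound(spent: float) -> float:
    """Two-sided critical value that spends `spent` at the FIRST look only."""
    return _STD_NORMAL.inv_cdf(1.0 - spent / 2.0)
```

For instance, the linear function spends 0.01 at $t = 0.2$ when $\alpha = 0.05$, and `first_look_bound(0.01)` gives the usual two-sided 1% critical value of about 2.576.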




2.3. Examples. Here, we use two examples to illustrate how to sequen- 
tially monitor the response-adaptive randomization procedures based on 
Theorem 2.1. 

Example 1 (Continuous responses from normal populations). Suppose the
responses of the two treatments are from two normal distributions
$Y_{i1} \sim N(\mu_1, \sigma_1^2)$ and $Y_{i2} \sim N(\mu_2,
\sigma_2^2)$, $i = 1,\ldots,n$. We would like to compare $\mu_1$ and
$\mu_2$. In this case, $\theta_1 = (\mu_1, \sigma_1^2 + \mu_1^2)$,
$\theta_2 = (\mu_2, \sigma_2^2 + \mu_2^2)$, $X_{ij} = (Y_{ij},
Y_{ij}^2)$ and $h(\theta_j) = \theta_{j1} = \mu_j$, $j = 1,2$. Then the
hypotheses are

$$ H_0 : \mu_1 = \mu_2 \quad \text{versus} \quad H_1 : \mu_1 \ne \mu_2. $$

Let the target allocation proportion be the Neyman allocation [Jennison
and Turnbull (2000)] with

$$ \rho_1 = \frac{\sigma_1}{\sigma_1 + \sigma_2} \quad \text{and} \quad \rho_2 = 1 - \rho_1 = \frac{\sigma_2}{\sigma_1 + \sigma_2}. \tag{2.4} $$

We can use other target allocation proportions, for example, the
optimal allocation proportion [Zhang and Rosenberger (2006)] and the
$D_A$-optimal allocation proportion [Gwise, Hu and Hu (2008)]. The
sequential statistic $Z_t(y,z)$ is a function from $\mathbb{R}^6$ to
$\mathbb{R}$:

$$ Z_t(y,z) = \frac{z_{11} - z_{21}}{\sqrt{(z_{12} - z_{11}^2)/([nt]y_1) + (z_{22} - z_{21}^2)/([nt]y_2)}}, $$

where $y = N([nt])/[nt]$ and $z = \hat\theta =
(\hat\theta_{11}([nt]), \hat\theta_{12}([nt]), \hat\theta_{21}([nt]),
\hat\theta_{22}([nt]))$. It is easy to see that
$h(\hat\theta_1([nt])) = \hat\mu_1([nt])$ and $h(\hat\theta_2([nt])) =
\hat\mu_2([nt])$. Also, the natural variance estimators are

$$ \widehat{\mathrm{Var}}(h(\hat\theta_1([nt]))) = \frac{\hat\sigma_1^2([nt])}{N_1([nt])} \quad \text{and} \quad \widehat{\mathrm{Var}}(h(\hat\theta_2([nt]))) = \frac{\hat\sigma_2^2([nt])}{N_2([nt])}, $$

where $\hat\sigma_1^2([nt])$ and $\hat\sigma_2^2([nt])$ are the usual
unbiased estimators of $\sigma_1^2$ and $\sigma_2^2$ based on the first
$[nt]$ responses ($N_1([nt])$ from treatment 1 and $N_2([nt])$ from
treatment 2), respectively. Therefore,

$$ v_1(\rho,\theta) = \frac{\sigma_1^2}{\rho_1} \quad \text{and} \quad v_2(\rho,\theta) = \frac{\sigma_2^2}{\rho_2}. $$

The test statistic is then

$$ Z_t = \frac{\hat\mu_1([nt]) - \hat\mu_2([nt])}{\sqrt{\hat\sigma_1^2([nt])/N_1([nt]) + \hat\sigma_2^2([nt])/N_2([nt])}}. \tag{2.5} $$

Then, based on Theorem 2.1, $B_t = \sqrt{t}\,Z_t$ is asymptotically a
standard Brownian process under $H_0$. Under $H_1$, $B_t -
\sqrt{n}\,\mu t$ is asymptotically a standard Brownian motion in
distribution, where

$$ \mu = \frac{\mu_1 - \mu_2}{\sqrt{\sigma_1^2/\rho_1 + \sigma_2^2/(1 - \rho_1)}}. $$
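The quantities of Example 1 can be sketched as follows. This is an illustration only: `z_normal` uses plain sample means and the unbiased sample variances (an assumption; the modified means of (2.1) could be substituted), and `single_look_power` is a rough approximation for one final analysis that ignores interim looks. All function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance
from typing import Sequence


def z_normal(y1: Sequence[float], y2: Sequence[float]) -> float:
    """Test statistic in the form of (2.5) from the responses observed so far.

    Each sample needs at least two observations with non-zero variance.
    """
    var_hat = variance(y1) / len(y1) + variance(y2) / len(y2)
    return (mean(y1) - mean(y2)) / sqrt(var_hat)


def drift(mu1: float, mu2: float, sigma1: float, sigma2: float,
          rho1: float) -> float:
    """Drift parameter mu of Theorem 2.1 for normal responses."""
    return (mu1 - mu2) / sqrt(sigma1 ** 2 / rho1 + sigma2 ** 2 / (1.0 - rho1))


def single_look_power(mu: float, n: int, alpha: float = 0.05) -> float:
    """Approximate two-sided power of one analysis at t = 1,
    treating Z_1 as N(sqrt(n) * mu, 1)."""
    std = NormalDist()
    z = std.inv_cdf(1.0 - alpha / 2.0)
    shift = sqrt(n) * abs(mu)
    return std.cdf(shift - z) + std.cdf(-shift - z)
```

As a usage example, with $\mu_1 = 1$, $\mu_2 = 1.4$, $\sigma_1 = 1$, $\sigma_2 = 2$ and the Neyman proportion $\rho_1 = 1/3$, `drift` returns $-0.4/3$, and `single_look_power` with $n = 500$ gives roughly 0.85.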
Example 2 (Binary responses). Assume $Y_{i1} \sim \mathrm{Bin}(1,p_1)$
and $Y_{i2} \sim \mathrm{Bin}(1,p_2)$, $i = 1,\ldots,n$, and we would
like to compare $p_1$ and $p_2$. In this case, $\theta_1 = (p_1)$,
$\theta_2 = (p_2)$, $X_{ij} = (Y_{ij})$ and $h(\theta_j) =
\theta_{j1}$, $j = 1,2$. The hypotheses are

$$ H_0 : p_1 = p_2 \quad \text{versus} \quad H_1 : p_1 \ne p_2. $$

Three common target allocations are: (i) Neyman allocation,

$$ \rho_1 = \frac{\sqrt{p_1(1-p_1)}}{\sqrt{p_1(1-p_1)} + \sqrt{p_2(1-p_2)}} \quad \text{and} \quad \rho_2 = \frac{\sqrt{p_2(1-p_2)}}{\sqrt{p_1(1-p_1)} + \sqrt{p_2(1-p_2)}}; \tag{2.6} $$

(ii) optimal allocation proposed by Rosenberger et al. (2001),

$$ \rho_1 = \frac{\sqrt{p_1}}{\sqrt{p_1} + \sqrt{p_2}} \quad \text{and} \quad \rho_2 = \frac{\sqrt{p_2}}{\sqrt{p_1} + \sqrt{p_2}}; \tag{2.7} $$

(iii) urn allocation [Wei and Durham (1978)],

$$ \rho_1 = \frac{q_2}{q_1 + q_2} \quad \text{and} \quad \rho_2 = \frac{q_1}{q_1 + q_2}, \tag{2.8} $$

where $q_j = 1 - p_j$, $j = 1,2$.

Neyman allocation is a commonly discussed allocation that is related to
efficiency in the field of response-adaptive randomization procedures.
We studied sequential monitoring of response-adaptive designs with
Neyman allocation in order to show that our proposed procedure is able
to achieve various objectives.
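The three target allocations for binary responses are simple functions of the success probabilities. The sketch below returns the target proportion for treatment 1 under each rule, taking the failure probability in the urn allocation to be $1 - p_j$; the function names are hypothetical.

```python
from math import sqrt


def neyman_rho1(p1: float, p2: float) -> float:
    """Neyman allocation (2.6): proportional to the response standard deviations."""
    s1, s2 = sqrt(p1 * (1.0 - p1)), sqrt(p2 * (1.0 - p2))
    return s1 / (s1 + s2)


def optimal_rho1(p1: float, p2: float) -> float:
    """Optimal allocation (2.7) of Rosenberger et al. (2001)."""
    return sqrt(p1) / (sqrt(p1) + sqrt(p2))


def urn_rho1(p1: float, p2: float) -> float:
    """Urn allocation (2.8), with failure probabilities q_j = 1 - p_j."""
    q1, q2 = 1.0 - p1, 1.0 - p2
    return q2 / (q1 + q2)
```

With $p_1 = 0.5$ and $p_2 = 0.625$, the optimal and urn targets for treatment 1 are about 0.472 and 0.429, so both rules place fewer patients on the treatment with the lower success probability.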

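In this case, $Z_t(y,z)$ is a function from $\mathbb{R}^4$ to
$\mathbb{R}$:

$$ Z_t(y,z) = \frac{z_{11} - z_{21}}{\sqrt{z_{11}(1 - z_{11})/([nt]y_1) + z_{21}(1 - z_{21})/([nt]y_2)}}, $$

where $y = (N_1([nt])/[nt], N_2([nt])/[nt])$, $z =
(\hat\theta_{11}([nt]), \hat\theta_{21}([nt]))$,
$h(\hat\theta_1([nt])) = \hat p_1([nt])$ and $h(\hat\theta_2([nt])) =
\hat p_2([nt])$. The corresponding variance estimators are

$$ \widehat{\mathrm{Var}}(h(\hat\theta_1([nt]))) = \frac{\hat p_1([nt])(1 - \hat p_1([nt]))}{N_1([nt])} \quad \text{and} \quad \widehat{\mathrm{Var}}(h(\hat\theta_2([nt]))) = \frac{\hat p_2([nt])(1 - \hat p_2([nt]))}{N_2([nt])}. $$

Therefore,

$$ v_1(\rho,\theta) = \frac{p_1(1 - p_1)}{\rho_1} \quad \text{and} \quad v_2(\rho,\theta) = \frac{p_2(1 - p_2)}{\rho_2}. $$

The test statistic is

$$ Z_t = \bigl(\hat p_1([nt]) - \hat p_2([nt])\bigr) \times \left( \frac{\hat p_1([nt])(1 - \hat p_1([nt]))}{N_1([nt])} + \frac{\hat p_2([nt])(1 - \hat p_2([nt]))}{N_2([nt])} \right)^{-1/2}. \tag{2.9} $$

Then $B_t = \sqrt{t}\,Z_t$ converges to a standard Brownian process in
distribution under $H_0$. Under $H_1$, $B_t - \sqrt{n}\,\mu t$ is
asymptotically a standard Brownian motion in distribution, where

$$ \mu = \frac{p_1 - p_2}{\sqrt{p_1(1 - p_1)/\rho_1 + p_2(1 - p_2)/(1 - \rho_1)}}. $$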



Theorem 2.1 can be applied to different situations, such as the
examples considered in Chapter 5 of Hu and Rosenberger (2006). In
Examples 1 and 2, now assume we would like to look at the process at
three points: $t_1 = 0.2$, $t_2 = 0.5$ and $t_3 = 1$. Then we can use
the corresponding critical values from the three spending functions
[Proschan, Lan and Wittes (2006)] in the last subsection for $Z_t$ to
keep the overall type I error at 0.05: O'Brien-Fleming-like boundaries
(4.877, 2.963, 1.969), linear boundaries (2.576, 2.377, 2.141) and
Pocock-like boundaries (2.438, 2.333, 2.225).
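A quick way to see how such boundaries act on the limiting process is a Monte Carlo check: simulate a standard Brownian motion at the three look times, form $Z_t = B_t/\sqrt{t}$, and count how often any two-sided boundary is crossed. The sketch below does this with the standard library only; the function name, replication count and seed are arbitrary choices for illustration, and the result is a simulation estimate rather than an exact probability.

```python
import random
from math import sqrt
from typing import Sequence


def crossing_probability(bounds: Sequence[float],
                         times: Sequence[float] = (0.2, 0.5, 1.0),
                         reps: int = 100_000, seed: int = 1) -> float:
    """Estimate P(|Z_t| >= bound at some look) for a standard Brownian motion."""
    rng = random.Random(seed)
    crossings = 0
    for _ in range(reps):
        b, prev_t = 0.0, 0.0
        for t, c in zip(times, bounds):
            # Independent Brownian increment over (prev_t, t].
            b += rng.gauss(0.0, sqrt(t - prev_t))
            prev_t = t
            if abs(b / sqrt(t)) >= c:     # Z_t = B_t / sqrt(t)
                crossings += 1
                break                     # the trial stops at the first crossing
    return crossings / reps
```

Calling `crossing_probability((2.576, 2.377, 2.141))` gives an estimate of the overall type I error of the linear boundaries under the Brownian limit; the same call with the other two boundary sets can be used for comparison.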

3. Simulation study. In Section 2, we obtained the asymptotic
distribution of the test statistic $Z_t$. In this section, we use the
two examples in Section 2 to study the finite sample properties of the
proposed procedure.

In Examples 1 and 2, we use the doubly adaptive biased coin design with
Hu and Zhang's allocation function (2.2) with $\gamma = 2$. In Tables
1-5, we use the same total sample size, 500. The first 50 patients
($n_0 = 25$) are randomly assigned to treatments 1 and 2 by using
permuted block randomization. Then, for the $l$th ($l > 50$) patient,
the unknown parameters are estimated by using (2.1) based on the first
$l - 1$ responses with $\theta_{0,1} = \theta_{0,2} = 0.5$. For normal
responses in Example 1, we estimate $\sigma_1^2$ and $\sigma_2^2$ by
using the standard unbiased estimators based on the first $l - 1$
responses.

For simplicity, we look at the test at three time points [$n_1 = 100$
($t_1 = 0.2$), $n_2 = 250$ ($t_2 = 0.5$) and $n = 500$ ($t_3 = 1$)].
Then the three sets of spending function boundaries in Section 2.3 are
used to ensure $\alpha = 0.05$. For each spending function, the first
row in the table is for DBCD and the second row is for complete
randomization (denoted as CR in the tables). All the simulations are
based on 5000 replications.
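The simulation setup described above can be sketched for binary responses as follows. This is a simplified illustration, not the code used for the tables: it assumes permuted blocks of size 2 for the first $2n_0$ patients (the block size is not stated in the text), targets the optimal allocation (2.7), uses the linear boundaries from Section 2.3, and evaluates the test statistic with plain sample proportions. The names `g`, `simulate_trial` and `rejection_rate` are hypothetical.

```python
import random
from math import sqrt


def g(s, r, gamma=2.0):
    """Allocation function (2.2)."""
    if s <= 0.0:
        return 1.0
    if s >= 1.0:
        return 0.0
    num = r * (r / s) ** gamma
    return num / (num + (1.0 - r) * ((1.0 - r) / (1.0 - s)) ** gamma)


def simulate_trial(p, rng, n=500, n0=25, looks=(100, 250, 500),
                   bounds=(2.576, 2.377, 2.141), gamma=2.0):
    """One DBCD trial with sequential monitoring.

    Returns (rejected, patients_enrolled, patients_on_treatment_1).
    Assumes every look happens after the 2*n0 start-up patients.
    """
    succ, cnt = [0, 0], [0, 0]
    start = []                      # start-up assignments, permuted blocks of size 2
    for _ in range(n0):
        block = [0, 1]
        rng.shuffle(block)
        start.extend(block)
    for l in range(1, n + 1):
        if l <= 2 * n0:
            arm = start[l - 1]
        else:
            # Modified means (2.1) with theta_0 = 0.5, then target (2.7).
            est = [(succ[j] + 0.5) / (cnt[j] + 1) for j in (0, 1)]
            rho1 = sqrt(est[0]) / (sqrt(est[0]) + sqrt(est[1]))
            arm = 0 if rng.random() < g(cnt[0] / (l - 1), rho1, gamma) else 1
        cnt[arm] += 1
        succ[arm] += int(rng.random() < p[arm])    # Bernoulli response
        if l in looks:
            ph = [succ[j] / cnt[j] for j in (0, 1)]
            var = sum(ph[j] * (1.0 - ph[j]) / cnt[j] for j in (0, 1))
            z = (ph[0] - ph[1]) / sqrt(var) if var > 0 else 0.0
            if abs(z) >= bounds[looks.index(l)]:
                return True, l, cnt[0]             # stop early and reject
    return False, n, cnt[0]


def rejection_rate(p, reps=400, seed=2024):
    """Fraction of simulated trials that reject the null hypothesis."""
    rng = random.Random(seed)
    return sum(simulate_trial(p, rng)[0] for _ in range(reps)) / reps
```

Calling `rejection_rate((0.5, 0.5))` estimates the type I error and `rejection_rate((0.5, 0.625))` estimates the power; with only 400 replications these estimates are much noisier than the 5000-replication results reported in the tables.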

In Table 1, we simulate Example 1 with two normal responses N(1,1) and
N(1,2) by using the Neyman allocation (2.4). We find that the type I
errors of sequentially monitoring the response-adaptive randomization
procedure and complete randomization are both well kept at the 0.05
level. We also report the mean and standard deviation of the actual
allocation proportion ($\rho_1$) for treatment 1 [N(1,1)]. We find that
the mean agrees with the Neyman allocation and the standard deviation
is reasonably small for DBCD. This indicates that the DBCD is able to
target the theoretical allocation proportion very well. In Table 2, we
simulate Example 2 with two binary responses $p_1 = p_2 = 0.5$, and the
target allocation is the optimal allocation (2.7). We obtain the same
conclusion as in Table 1. We have also done simulations for some other
cases, and similar results are obtained. These numerical results
indicate that sequential monitoring of the response-adaptive
randomization will not inflate the type I error with the appropriate
boundaries based on Theorem 2.1.

Table 1
Example 1 with Neyman allocation, $\mu_1 = \mu_2 = 1$, $\sigma_1 = 1$, $\sigma_2 = 2$

Critical values   Randomization   Type I error   $\rho_1$ (s.e.)
B-F-like          DBCD            0.055          0.333 (0.020)
B-F-like          CR              0.052          0.500 (0.022)
Linear            DBCD            0.048          0.333 (0.020)
Linear            CR              0.053          0.500 (0.023)
Pocock-like       DBCD            0.051          0.332 (0.020)
Pocock-like       CR              0.052          0.500 (0.023)

Table 2
Example 2 with optimal allocation, $p_1 = p_2 = 0.5$

Critical values   Randomization   Type I error   $\rho_1$ (s.e.)
B-F-like          DBCD            0.051          0.500 (0.016)
B-F-like          CR              0.046          0.500 (0.023)
Linear            DBCD            0.055          0.500 (0.019)
Linear            CR              0.061          0.500 (0.023)
Pocock-like       DBCD            0.056          0.500 (0.019)
Pocock-like       CR              0.050          0.500 (0.022)

Table 3
Example 1 with Neyman allocation, $\mu_1 = 1$, $\mu_2 = 1.4$, $\sigma_1 = 1$, $\sigma_2 = 2$

Critical values   Randomization   Power   $\rho_1$ (s.e.)   $N_1$   $N_2$   $N_3$
B-F-like          DBCD            0.847   0.333 (0.021)     2       1013    3222
B-F-like          CR              0.807   0.500 (0.024)     1       842     3193
Linear            DBCD            0.812   0.332 (0.027)     594     1429    2035
Linear            CR              0.765   0.500 (0.028)     477     1380    1970
Pocock-like       DBCD            0.792   0.332 (0.028)     741     1443    1774
Pocock-like       CR              0.738   0.500 (0.028)     544     1309    1835

Next, we show other advantages of the sequential monitoring of the response- 
adaptive randomization procedure. In Table 3, we simulate Example 1 with 
two normal responses iV(l,l) and N(1A, 2) using Neyman allocation (2.4) 
as the target allocation that maximizes the power. The power of the sequen- 
tial monitoring of the response-adaptive randomization procedure is about 
5%-8% higher than sequentially monitoring the complete randomization. 
Ni in the table is the number of rejections at the ith look. Rejection at 
the first two looks means stopping the trial earlier. DBCD with sequential 
monitoring obviously stops the trial earlier than complete randomization. 

In Table 4, we simulate Example 2 with two binary responses p\ = 0.5 
and p2 = 0.625 using the urn allocation (2.8) as the target allocation that 
assigns more people to the better treatment. If we reject the null hypothesis 
at the first two looks, we assign all the remaining patients to the estimated 
better treatment and count the total failure number. We do this only for the 
comparison in the simulation study. In a real clinical trial, we stop the trial if 
the null hypothesis is rejected at an interim look. From the mean total failure 
number, the DBCD with sequential monitoring has lower failure numbers 

Table 4 





Example 


2 with urn 


allocation, pi 


= 0.5, p 2 = 


= 0.625 




Critical values 


Randomization Power 


Pi (s.e.) 


Nx 


N 2 


N a 


Total failures (s.e.) 


B-F-like 


DBCD 


0.811 


0.426 (0.033) 


4 


839 


3214 


211 (13) 


B-F-like 


CR 


0.811 


0.500 (0.024) 


1 


839 


3215 


217 (13) 


Linear 


DBCD 


0.762 


0.421 (0.041) 


503 


1396 


1912 


206 (14) 


Linear 


CR 


0.767 


0.500 (0.029) 


521 


1300 


2016 


212 (14) 


Pocock-like 


DBCD 


0.749 


0.421 (0.042) 


609 


1325 


1809 


205 (14) 


Pocock-like 


CR 


0.738 


0.501 (0.029) 


603 


1312 


1773 


211 (15) 



Table 5 

Example 2 with optimal allocation, pi = 0.5, p2 = 0.625 



Critical values 


Randomization 


Power 


pi (s.e.) 


Ni 


N 2 


N 3 


Total failures (s.e.) 


B-F-like 


DBCD 


0.810 


0.471 (0.017) 


4 


863 


3185 


214 (12) 


B-F-like 


CR 


0.805 


0.501 (0.024) 


4 


795 


3229 


218 (13) 


Linear 


DBCD 


0.768 


0.468 (0.022) 


520 


1354 


1964 


210 (14) 


Linear 


CR 


0.762 


0.500 (0.029) 


474 


1367 


1971 


214 (14) 


Pocock-like 


DBCD 


0.754 


0.469 (0.023) 


673 


1309 


1787 


210 (14) 


Pocock-like 


CR 


0.749 


0.500 (0.03) 


602 


1351 


1793 


213 (15) 


1.96 


DBCD 


0.805 


0.472 (0.015) 


NA 


NA 


NA 


217 (11) 


1.96 


CR 


0.802 


0.500 (0.022) 


NA 


NA 


NA 


221 (11) 

Table 6
Redesign of the HIV trial with full sample size

Target allocation   Critical values  Allocation proportion (s.e.)  Power  Total failures (s.e.)
CR                  linear           0.500 (0.039)                 0.999  60.1 (11.1)
CR                  1.96             0.501 (0.023)                 0.999  80.7 (8.2)
Urn allocation      linear           0.751 (0.062)                 0.996  52.3 (9.2)
Optimal allocation  linear           0.527 (0.021)                 0.997  56.4 (10.8)



than complete randomization for each type of spending function. $N_1$, $N_2$
and $N_3$ show that our methods stop the trial a little earlier, while the power
is almost the same.
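
To make the DBCD rows of Table 4 concrete, the following is a minimal sketch of one way to simulate the allocation step. It assumes that (2.2) is the Hu and Zhang (2004) allocation function with $\gamma = 2$, that the urn allocation target is $q_2/(q_1 + q_2)$ with $q_j = 1 - p_j$, and that each trial enrolls 500 patients; the function names, the smoothed estimator and the sample size are illustrative choices, not taken from the paper, and no sequential monitoring is applied here.

```python
import random

def dbcd_prob(x, rho, gamma=2.0):
    """Probability of assigning treatment 1 given current proportion x and target rho
    (assumed Hu-Zhang form of the allocation function)."""
    if x <= 0.0:
        return 1.0
    if x >= 1.0:
        return 0.0
    a = rho * (rho / x) ** gamma
    b = (1.0 - rho) * ((1.0 - rho) / (1.0 - x)) ** gamma
    return a / (a + b)

def urn_target(p1, p2):
    """Assumed urn allocation target for treatment 1: q2 / (q1 + q2)."""
    q1, q2 = 1.0 - p1, 1.0 - p2
    return q2 / (q1 + q2)

def simulate_dbcd(n, p1, p2, rng, gamma=2.0):
    """Run one trial without monitoring; return (proportion on treatment 1, failures)."""
    assigned = [0, 0]
    successes = [0, 0]
    true_p = (p1, p2)
    for _ in range(n):
        # Smoothed success-rate estimates keep the target strictly inside (0, 1).
        est = [(successes[j] + 0.5) / (assigned[j] + 1.0) for j in range(2)]
        total = assigned[0] + assigned[1]
        x = assigned[0] / total if total > 0 else 0.5
        prob1 = dbcd_prob(x, urn_target(est[0], est[1]), gamma)
        arm = 0 if rng.random() < prob1 else 1
        assigned[arm] += 1
        successes[arm] += int(rng.random() < true_p[arm])
    return assigned[0] / n, n - sum(successes)

rng = random.Random(7)
runs = [simulate_dbcd(500, 0.5, 0.625, rng) for _ in range(50)]
mean_proportion = sum(r[0] for r in runs) / len(runs)
mean_failures = sum(r[1] for r in runs) / len(runs)
print(round(mean_proportion, 3), round(mean_failures, 1))
```

Under these assumptions the average proportion should settle near the urn target of about 0.43, in line with the allocation proportions reported for the DBCD rows.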

In Table 5, we simulate Example 2 with two binary responses, $p_1 = 0.5$ and
$p_2 = 0.625$, using the optimal allocation (2.7), which is used to maximize the
power while keeping the total number of failures fixed. If we reject the null
hypothesis at either of the first two looks, we deal with the remaining
patients in the same way as in Table 4. We find that sequential monitoring of
the response-adaptive randomization procedure can achieve the aim of the
optimal allocation: its power is larger and its number of failures is smaller
than those of the complete randomization procedure. In this table, we also
report simulations without sequential monitoring; that is, we test only once,
at the end of the trial, with critical value 1.96 for the nominal significance
level 0.05. These results are in the last two rows. Comparing them with the
rows above shows that sequential monitoring reduces the total number of
failures.
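
The allocation proportions in Tables 4 and 5 can be compared with their targets. The sketch below assumes the urn allocation target is $q_2/(q_1 + q_2)$ and that the optimal allocation (2.7) is $\sqrt{p_1}/(\sqrt{p_1} + \sqrt{p_2})$; both formulas are assumptions here, chosen because they reproduce the reported proportions, and the function names are illustrative.

```python
import math

def urn_allocation(p1, p2):
    """Assumed urn target for treatment 1: q2 / (q1 + q2), with q = 1 - p."""
    q1, q2 = 1.0 - p1, 1.0 - p2
    return q2 / (q1 + q2)

def optimal_allocation(p1, p2):
    """Assumed optimal target for treatment 1: sqrt(p1) / (sqrt(p1) + sqrt(p2))."""
    return math.sqrt(p1) / (math.sqrt(p1) + math.sqrt(p2))

p1, p2 = 0.5, 0.625  # Example 2
urn_rho = urn_allocation(p1, p2)
opt_rho = optimal_allocation(p1, p2)
print(round(urn_rho, 3), round(opt_rho, 3))
```

With these formulas the targets are about 0.429 and 0.472, close to the simulated DBCD proportions of roughly 0.42 to 0.43 in Table 4 and 0.47 in Table 5.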

Based on the simulation results, we can see the advantages of sequentially
monitoring response-adaptive randomized clinical trials: (i) controlling the
type I error well; (ii) reducing the total number of failures; (iii) increasing
power; and (iv) stopping the trial earlier (reducing the total sample size).

4. Re-designing the HIV transmission trial. Maternal-infant transmission
is the primary means by which infants are infected with HIV. Connor et al.
(1994) reported a trial to evaluate the drug AZT (zidovudine) in reducing
the risk of maternal-infant HIV transmission. In this clinical trial, 477
HIV-infected pregnant women were enrolled from April 1991 to December 1993
and assigned to the zidovudine treatment group or the placebo group with a
50-50 randomization scheme. The trial was randomized, double-blind and
placebo-controlled; 239 women were allocated to the treatment group and 238
to the placebo group. At the end of the trial, 8.3% of the infants in the
treatment group were infected with HIV, compared with 25.5% in the placebo
group.

In Table 6, we redesign the study by sequential monitoring of both complete
randomization (the first two rows in the table) and response-adaptive
randomization [DBCD (2.2) with $\gamma = 2$] (the last two rows in the table). We
assume the success rate for the treatment group is $p_1 = 0.917$ and that for
the placebo group is $p_2 = 0.745$ (as reported in the original paper). We look
at the test at the same three time points as in the last section,
$n_1 = 95$ ($t_1 = 0.2$), $n_2 = 143$ ($t_2 = 0.5$) and $n = 239$ ($t_3 = 1$). The
boundary we use is that of the linear spending function, (2.576, 2.377, 2.141),
except in the second row of the table, where we use equal allocation without
sequential monitoring. We report the actual allocation proportion for the
treatment group, the power and the total number of HIV infections. As before,
if we reject the null hypothesis at either of the first two looks, we assign
all the remaining patients to the estimated better treatment. Comparing the
first two rows, we find that sequential monitoring decreases the number of HIV
infections dramatically. Response-adaptive randomization also reduces the
number of HIV infections compared with complete randomization. Sequentially
monitoring the DBCD while targeting the urn allocation gives the smallest
number of HIV infections, which agrees with the aim of the urn allocation.

Table 7
Redesign of the HIV trial with sample size n = 245

Target allocation   Critical values  Allocation proportion (s.e.)  Power  Total failures (s.e.)
CR                  B-F-like         0.500 (0.036)                 0.947  40.1 (7.0)
CR                  linear           0.501 (0.042)                 0.942  36.6 (7.5)
CR                  1.96             0.500 (0.032)                 0.958  43.1 (5.8)
Urn allocation      B-F-like         0.745 (0.068)                 0.920  30.7 (5.9)
Urn allocation      linear           0.747 (0.074)                 0.885  29.3 (6.1)
Optimal allocation  B-F-like         0.528 (0.023)                 0.952  36.8 (6.7)
Optimal allocation  linear           0.529 (0.025)                 0.945  32.8 (7.3)
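
As a rough check on the failure counts, the expected number of HIV infections without early stopping follows directly from the allocation proportion. The sketch below assumes a fixed allocation proportion `rho` for the treatment group and the full sample size of 477; the function name is illustrative, and the urn target $q_2/(q_1 + q_2)$ is an assumption rather than a formula quoted from the paper.

```python
def expected_failures(n, rho, p1, p2):
    """Expected failures when a fraction rho of n patients receives treatment 1."""
    return n * (rho * (1.0 - p1) + (1.0 - rho) * (1.0 - p2))

n, p1, p2 = 477, 0.917, 0.745
q1, q2 = 1.0 - p1, 1.0 - p2

cr_failures = expected_failures(n, 0.5, p1, p2)              # equal allocation
urn_failures = expected_failures(n, q2 / (q1 + q2), p1, p2)  # assumed urn target
print(round(cr_failures, 1), round(urn_failures, 1))
```

Equal allocation gives about 80.6 expected infections, close to the 80.7 reported for complete randomization without monitoring in Table 6. The monitored designs report fewer, since the remaining patients go to the estimated better treatment after an early rejection.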

In Table 7, we reduce the full sample size to 245 (to achieve power 0.95
for complete randomization) and keep all the other settings unchanged. We
reach the same conclusion about the number of HIV infections as in Table 6. We
also find that targeting the optimal allocation with the DBCD gives slightly
higher power than targeting equal allocation when sequential monitoring is
used. Targeting the urn allocation with the DBCD gives slightly lower power,
but the smallest number of HIV infections. Overall, sequential monitoring of
the response-adaptive randomization procedure is better than that of complete
randomization, since it reduces the number of HIV infections while retaining
good power.
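
The power of the fixed design under complete randomization can be approximated in closed form. The sketch below assumes a two-sided Wald test of the difference in success proportions at level 0.05, with the normal approximation and exactly equal group sizes; the paper's test statistic is defined through a general function $h$, so this is only one illustrative instance, and the function name is hypothetical.

```python
import math
from statistics import NormalDist

def fixed_design_power(n, rho, p1, p2, alpha=0.05):
    """Normal-approximation power of a two-sided Wald test for p1 - p2."""
    nd = NormalDist()
    se = math.sqrt(p1 * (1 - p1) / (n * rho) + p2 * (1 - p2) / (n * (1 - rho)))
    z_crit = nd.inv_cdf(1 - alpha / 2)
    drift = abs(p1 - p2) / se
    # Probability of rejecting in either tail.
    return nd.cdf(drift - z_crit) + nd.cdf(-drift - z_crit)

power_245 = fixed_design_power(245, 0.5, 0.917, 0.745)
power_477 = fixed_design_power(477, 0.5, 0.917, 0.745)
print(round(power_245, 3), round(power_477, 3))
```

Under these assumptions the approximate power is about 0.958 at $n = 245$ and above 0.999 at $n = 477$, which agrees with the rows for complete randomization with critical value 1.96 in Tables 7 and 6.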

5. Concluding remarks. Sequential monitoring has become a standard
technique in clinical trials. To apply response-adaptive randomization in
clinical trials, it is important to know how to sequentially monitor adaptive
randomized trials. In this paper, we overcome this hurdle and show the
advantages of sequentially monitoring response-adaptive randomized clinical
trials both theoretically and numerically. We use a Gaussian process in the
Skorohod topology to describe the relationship between the allocation and
the parameter estimators. One of the main contributions of this paper is to
show that the sequential statistics can be asymptotically approximated by a
Brownian process in distribution under both the null and alternative
hypotheses. Further, we find that the sequential test statistics satisfy the
canonical joint distribution asymptotically. Consequently, the results of this
paper not only solve the problem of preserving a preset type I error but may
also lead to many areas of potential future research.

We have studied how to sequentially monitor a clinical trial based on the
doubly adaptive biased coin design proposed by Eisele and Woodroofe (1995)
and Hu and Zhang (2004). Another important family of response-adaptive
randomization procedures is based on urn models, which include the randomized
play-the-winner rule [Wei and Durham (1978)], generalized Friedman's urn
models [Athreya and Karlin (1968), Bai and Hu (2005)], the drop-the-loser rule
[Ivanova (2003)], sequential estimation-adjusted urn models [Zhang, Hu and
Cheung (2006)], etc. The technique used in this paper opens a door to studying
the properties of sequential monitoring of clinical trials based on these urn
models or on the efficient randomized adaptive designs [Hu, Zhang and He
(2009)]. We leave this for future study.

In this paper, we have used the $\alpha$-spending function to calculate the
critical boundaries. Because the sequential test statistics satisfy the
canonical joint distribution asymptotically, we can implement all the
sequential techniques introduced in Jennison and Turnbull (2000) based on this
canonical form. We can also use the optimal spending functions in Anderson
(2007) or the beta spending functions in DeMets (2006). We leave the details
for future research.
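
As an illustration of how a spending function connects to the Brownian-motion limit, the sketch below takes the linear-spending boundaries (2.576, 2.377, 2.141) quoted in Section 4 for looks at $t = 0.2, 0.5, 1$. It assumes a two-sided test at level 0.05 with $\alpha(t) = 0.05\,t$ and standardized statistics $B_t/\sqrt{t}$; the first boundary has a closed form, and the overall crossing probability is estimated by Monte Carlo. The function names, seed and number of paths are illustrative choices.

```python
import math
import random
from statistics import NormalDist

def first_linear_boundary(alpha, t1):
    """Two-sided critical value at the first look when alpha(t) = alpha * t."""
    return NormalDist().inv_cdf(1.0 - alpha * t1 / 2.0)

def crossing_probability(times, bounds, n_paths, rng):
    """Monte Carlo estimate of P(|B_t| / sqrt(t) >= c at some look) under H0."""
    crossed = 0
    for _ in range(n_paths):
        b, prev_t = 0.0, 0.0
        for t, c in zip(times, bounds):
            b += rng.gauss(0.0, math.sqrt(t - prev_t))  # independent increment
            prev_t = t
            if abs(b) >= c * math.sqrt(t):
                crossed += 1
                break  # the trial stops at the first crossing
    return crossed / n_paths

times = (0.2, 0.5, 1.0)
bounds = (2.576, 2.377, 2.141)
c1 = first_linear_boundary(0.05, times[0])
type_one_error = crossing_probability(times, bounds, 100_000, random.Random(2024))
print(round(c1, 3), round(type_one_error, 3))
```

The first boundary evaluates to about 2.576, and the simulated crossing probability should be close to the nominal 0.05. The later boundaries require the joint distribution of the statistics across looks, which is why the canonical joint distribution matters.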



APPENDIX: PROOFS 

First, we introduce some further notation. For a function
$\eta(u, w): \Re^L \times \Re^M \to \Re^2$, we denote the partial derivative matrices as

$$\nabla_u(\eta) = \left(\frac{\partial \eta_k}{\partial u_i};\ i = 1,\ldots,L,\ k = 1,2\right)_{L \times 2}$$

and

$$\nabla_w(\eta) = \left(\frac{\partial \eta_k}{\partial w_j};\ j = 1,\ldots,M,\ k = 1,2\right)_{M \times 2}.$$

Let $H = \nabla_r (g(r,s), 1 - g(r,s))|_{(\rho_1,\rho_1)}$ and
$E = \nabla_s (g(r,s), 1 - g(r,s))|_{(\rho_1,\rho_1)}$ be the partial derivative
matrices of the allocation function $g$. Further, let
$V = \mathrm{diag}(\mathrm{var}(X_{11})/\rho_1, \mathrm{var}(X_{12})/\rho_2)$,
$\Sigma_3 = (\nabla(\rho)|_\theta)' V \nabla(\rho)|_\theta$,
$\Sigma_1 = \mathrm{diag}(\rho) - \rho'\rho$ and $\Sigma_2 = E'\Sigma_3 E$.
Hu and Zhang (2004) studied the asymptotic properties of $N(n)$,
$\hat\rho(n)$ and $\hat\theta(n)$ at the end of the trial. Based on their
results, one can do the corresponding statistical inference after observing all
responses of the clinical trial. To monitor the response-adaptive randomized
trial sequentially, we need to know the theoretical properties of the processes
$N([nt])$ and $\hat\theta([nt])$ for any given $t \in (0, 1]$. To do this, we
start with Lemma A.1.

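Lemma A.1. Let $W_{1t}$ and $W_{2t}$ be two independent standard
two-dimensional Brownian processes. $N([nt])$, $\hat\theta([nt])$, $\rho$ and
$\theta$ are defined as in Section 2. Under the conditions of Theorem 2.1, we have

$$n^{-1/2}([nt])\left(\frac{N([nt])}{[nt]} - \rho,\ \hat\theta([nt]) - \theta\right) \to (G_t, W_{2t}V^{1/2}) \tag{A.1}$$

in distribution in the space $D_{[0,1]}$ with the Skorohod topology, where the
Gaussian process

$$G_t = \int_0^t (dW_{1x})\,\Sigma_1^{1/2}\left(\frac{t}{x}\right)^{H} + \int_0^t (dW_{2x})\,\Sigma_2^{1/2}\int_x^t \frac{1}{y}\left(\frac{t}{y}\right)^{H} dy \tag{A.2}$$

is the solution of the stochastic differential equation

$$dG_t = (dW_{1t})\,\Sigma_1^{1/2} + \frac{W_{2t}}{t}\,\Sigma_2^{1/2}\,dt + \frac{G_t}{t}\,H\,dt \quad \text{with } G_0 = 0,$$

and $a^H$ is the matrix power function defined as

$$a^H = e^{H \ln a} = \sum_{j=0}^{\infty} \frac{(\ln a)^j}{j!}\,H^j.$$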
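
The matrix power function $a^H = e^{H \ln a}$ can be evaluated numerically by truncating its power series. The following is a minimal sketch using plain nested lists; the helper names, the truncation at 60 terms and the example matrices are illustrative assumptions and are not part of the paper.

```python
import math

def mat_mul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matrix_power_fn(a, h, terms=60):
    """Approximate a**H = sum_j (ln a)^j / j! * H^j by a truncated series."""
    if a <= 0:
        raise ValueError("a must be positive")
    n = len(h)
    log_a = math.log(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # current series term, starts at the identity
    for j in range(1, terms):
        # term_j = term_{j-1} * H * ln(a) / j
        term = [[x * log_a / j for x in row] for row in mat_mul(term, h)]
        result = [[result[i][k] + term[i][k] for k in range(n)] for i in range(n)]
    return result

diag_case = matrix_power_fn(2.0, [[-1.0, 0.0], [0.0, -3.0]])
nilpotent_case = matrix_power_fn(math.e, [[0.0, 1.0], [0.0, 0.0]])
print(diag_case, nilpotent_case)
```

For a diagonal $H$ the result is the entrywise scalar power, and for a nilpotent $H$ the series terminates, which gives two simple sanity checks on the definition.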



Proof. It is worth noting that the response-adaptive design in Theorem 2.1
satisfies all the conditions of Hu and Zhang (2004), so all the results in Hu
and Zhang (2004) are valid. We will prove this lemma by using the weak
convergence of the martingale [cf. Theorem 4.1 of Hall and Heyde (1980)]. To do
this, we first approximate the process
$(N([nt])/[nt] - \rho,\ \hat\theta([nt]) - \theta)$ by a martingale and then prove
the following two facts: (1) the Lindeberg condition holds for the
approximating martingale process; and (2) the limiting covariance of
$n^{-1/2}([nt])(([nt])^{-1}N([nt]) - \rho,\ \hat\theta([nt]) - \theta)$ agrees with
that of $(G_t, W_{2t}V^{1/2})$.

Now, we use the martingale approximation of $N(n) - n\rho$ and
$\hat\theta(n) - \theta$ from Hu and Zhang (2004). Let
$\mathcal{F}_m = \sigma(T_1, \ldots, T_m, X_1, \ldots, X_m)$ be the $\sigma$-field
generated by the previous $m$ stages. Then under $\mathcal{F}_{m-1}$, $T_m$ and
$X_m$ are independent, and

$$E[T_{m,1} \mid \mathcal{F}_{m-1}] = g\left(\frac{N_1(m-1)}{m-1},\ \hat\rho_1(m-1)\right).$$




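Let $Q_n = \sum_{m=1}^{n} \Delta Q_m$, where
$\Delta Q_m = (\Delta Q_{m,1}, \Delta Q_{m,2}) = (\Delta Q_{m,1k}, \Delta Q_{m,2k};\ k = 1, \ldots, d)$
and $\Delta Q_{m,jk} = T_{m,j}(X_{m,jk} - \theta_{jk})/\rho_j$, $j = 1, 2$. Then
$Q_n = O(\sqrt{n \log\log n})$ a.s. is a sequence of martingales and we can prove

$$\hat\theta(n) - \theta = \frac{Q_n}{n} + o\left(\frac{\log n}{n}\right) \quad \text{a.s.} \tag{A.3}$$

Let $M_n = \sum_{m=1}^{n} \Delta M_m$, where
$\Delta M_m = T_m - E[T_m \mid \mathcal{F}_{m-1}]$, and let $B_{n,m}$ be as
defined in Hu and Zhang (2004). Then

$$N(n) - n\rho = \sum_{m=1}^{n} \Delta M_m B_{n,m} + \sum_{m=1}^{n} \Delta Q_m \nabla(\rho)|_\theta E \sum_{k=m}^{n} \frac{1}{k} B_{n,k} + o(n^{1/2-\delta/3}) := U_n + o(n^{1/2-\delta/3})$$

almost surely, where $U_n$ is a sum of martingale differences.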

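We can approximate the processes $N([nt]) - [nt]\rho$ and
$\hat\theta([nt]) - \theta$ (for any point $t \in (0, 1]$) in the same way as
$N(n) - n\rho$ and $\hat\theta(n) - \theta$. We obtain

$$\hat\theta([nt]) - \theta = \frac{Q_{[nt]}}{[nt]} + o\left(\frac{\log [nt]}{[nt]}\right) \quad \text{a.s.} \tag{A.4}$$

and

$$N([nt]) - [nt]\rho = \sum_{m=1}^{[nt]} \Delta M_m B_{[nt],m} + \sum_{m=1}^{[nt]} \Delta Q_m \nabla(\rho)|_\theta E \sum_{k=m}^{[nt]} \frac{1}{k} B_{[nt],k} + o(([nt])^{1/2-\delta/3}) := U_{[nt]} + o(([nt])^{1/2-\delta/3})$$

almost surely.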

Hu and Zhang (2004) proved that both martingales $Q_n$ and $U_n$ satisfy
the Lindeberg conditions. Similarly, we can show that both martingales
$Q_{[nt]}$ and $U_{[nt]}$ also satisfy the Lindeberg conditions. Now we just have
to calculate the covariance matrix of the martingales $Q_{[nt]}$ and
$U_{[nt]}$. First, based on the results of Hu and Zhang (2004), we have

$$\hat\rho(n) - \rho = O\left(\sqrt{\frac{\log\log n}{n}}\right) \quad \text{and} \quad \frac{N(n)}{n} - \rho = O\left(\sqrt{\frac{\log\log n}{n}}\right)$$

almost surely. Therefore, for any $t \in (0, 1]$, we have

$$\hat\rho([nt]) - \rho = O\left(\sqrt{\frac{\log\log [nt]}{[nt]}}\right) \quad \text{and} \quad ([nt])^{-1}N([nt]) - \rho = O\left(\sqrt{\frac{\log\log [nt]}{[nt]}}\right) \tag{A.5}$$

almost surely. Now, we can calculate
$\mathrm{Var}[\Delta M_{[nt]} \mid \mathcal{F}_{[nt]-1}]$,
$\mathrm{Var}[\Delta Q_{[nt]} \mid \mathcal{F}_{[nt]-1}]$ and
$\mathrm{Cov}[\Delta M_{[nt]}, \Delta Q_{[nt]} \mid \mathcal{F}_{[nt]-1}]$.

First, $\Delta M_{[nt]} = T_{[nt]} - E[T_{[nt]} \mid \mathcal{F}_{[nt]-1}]$ is a
binary random vector. Based on conditions (A2), (A3) and (A.5), we have

$$\mathrm{Var}[\Delta M_{[nt]} \mid \mathcal{F}_{[nt]-1}] = \Sigma_1 + o(1) \tag{A.6}$$

almost surely. Similarly, we can show

$$\mathrm{Var}[\Delta Q_{[nt]} \mid \mathcal{F}_{[nt]-1}] = V + o(1) \tag{A.7}$$

and

$$\mathrm{Cov}[\Delta M_{[nt]}, \Delta Q_{[nt]} \mid \mathcal{F}_{[nt]-1}] = o(1) \tag{A.8}$$

almost surely.

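Based on results (A.6), (A.7) and (A.8), it follows that for any
$0 < s < t < 1$,

$$\begin{aligned}
\mathrm{Cov}[Q_{[ns]}, Q_{[nt]}] &= \mathrm{Cov}\left[\sum_{m=1}^{[ns]} \Delta Q_m,\ \sum_{m=1}^{[nt]} \Delta Q_m\right] = ns(V + o(1)) = nsV + o(n), \\
\mathrm{Cov}[U_{[ns]}, U_{[nt]}] &= n\Lambda_{11}(s,t) + o(n), \\
\mathrm{Cov}[Q_{[ns]}, U_{[nt]}] &= \mathrm{Cov}\left[\sum_{m=1}^{[ns]} \Delta Q_m,\ \sum_{m=1}^{[nt]} \Delta M_m B_{[nt],m} + \sum_{m=1}^{[nt]} \Delta Q_m \nabla(\rho)|_\theta E \sum_{k=m}^{[nt]} \frac{1}{k} B_{[nt],k}\right] \\
&= \mathrm{Cov}\left[\sum_{m=1}^{[ns]} \Delta Q_m,\ \sum_{m=1}^{[nt]} \Delta M_m B_{[nt],m}\right] + \mathrm{Cov}\left[\sum_{m=1}^{[ns]} \Delta Q_m,\ \sum_{m=1}^{[nt]} \Delta Q_m \nabla(\rho)|_\theta E \sum_{k=m}^{[nt]} \frac{1}{k} B_{[nt],k}\right] \\
&= \mathrm{Cov}\left[\sum_{m=1}^{[ns]} \Delta Q_m,\ \sum_{m=1}^{[nt]} \Delta Q_m \nabla(\rho)|_\theta E \sum_{k=m}^{[nt]} \frac{1}{k} B_{[nt],k}\right] \\
&= \left(V\nabla(\rho)|_\theta E + o(1)\right) \sum_{m=1}^{[ns]} \left(\sum_{k=m}^{[nt]} \frac{1}{k} B_{[nt],k}\right) \\
&= nV\nabla(\rho)|_\theta E \int_0^s dx \int_x^t \frac{1}{y}\left(\frac{t}{y}\right)^{H} dy + o(n) \\
&= n\Lambda_{21}(s,t) + o(n),
\end{aligned}$$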

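and similarly,

$$\mathrm{Cov}[Q_{[nt]}, U_{[ns]}] = n\Lambda_{12}(s) + o(n),$$

where

$$\Lambda_{11}(s,t) = \int_0^s \left[\left(\frac{s}{x}\right)^{H}\right]' \Sigma_1 \left(\frac{t}{x}\right)^{H} dx + \int_0^s \left[\int_x^s \frac{1}{y}\left(\frac{s}{y}\right)^{H} dy\right]' \Sigma_2 \left[\int_x^t \frac{1}{y}\left(\frac{t}{y}\right)^{H} dy\right] dx,$$

$$\Lambda_{21}(s,t) = V\nabla(\rho)|_\theta E \int_0^s dx \int_x^t \frac{1}{y}\left(\frac{t}{y}\right)^{H} dy,$$

$$\Lambda_{12}(s) = \int_0^s dx \left[\int_x^s \frac{1}{y}\left(\frac{s}{y}\right)^{H} dy\right]' E' (\nabla(\rho)|_\theta)' V.$$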

Therefore, the asymptotic covariance function of
$n^{-1/2}(U_{[nt]}, Q_{[nt]})$ agrees with that of $(G_t, W_{2t}V^{1/2})$. So by
weak convergence of the martingale [cf. Theorem 4.1 of Hall and Heyde (1980)],
we have

$$n^{-1/2}([nt])\left(\frac{N([nt])}{[nt]} - \rho,\ \hat\theta([nt]) - \theta\right) \to (G_t, W_{2t}V^{1/2})$$

in distribution in the space $D_{[0,1]}$ with the Skorohod topology. □

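Proof of Theorem 2.1. We assume for $j = 1, 2$

$$[nt]\,\widehat{\mathrm{Var}}(h(\hat\theta_j([nt]))) = [nt]\,v_j(N([nt])/[nt],\ \hat\theta([nt]))(1 + o_P(1))$$

and

$$[nt]\,\mathrm{Var}(h(\hat\theta_j([nt]))) = [nt]\,v_j(\rho, \theta),$$

where $v_j$ is a continuous function. We also assume

$$[nt]\,v_j(N([nt])/[nt],\ \hat\theta([nt])) = [nt]\,v_j(\rho, \theta) + O\left(\sqrt{\frac{\log\log [nt]}{[nt]}}\right) \quad \text{a.s.},$$

which holds for most circumstances, since

$$N([nt])/[nt] = \rho + O\left(\sqrt{\frac{\log\log [nt]}{[nt]}}\right) \quad \text{a.s.}$$

and

$$\hat\theta([nt]) = \theta + O\left(\sqrt{\frac{\log\log [nt]}{[nt]}}\right) \quad \text{a.s.}$$

So

$$[nt]\,\widehat{\mathrm{Var}}(h(\hat\theta_j([nt]))) = [nt]\,\mathrm{Var}(h(\hat\theta_j([nt]))) + O_p\left(\sqrt{\frac{\log\log [nt]}{[nt]}}\right).$$

That is, $[nt]\,\widehat{\mathrm{Var}}(h(\hat\theta_j([nt])))$ converges to
$[nt]\,\mathrm{Var}(h(\hat\theta_j([nt])))$, $j = 1, 2$, in probability. By
Slutsky's theorem, the sequential statistics

$$B_t(\hat\theta([nt])) = \sqrt{t}\,\frac{h(\hat\theta_1([nt])) - h(\hat\theta_2([nt]))}{\sqrt{\widehat{\mathrm{Var}}(h(\hat\theta_1([nt]))) + \widehat{\mathrm{Var}}(h(\hat\theta_2([nt])))}}$$

and

$$B_t^*(\hat\theta([nt])) = \sqrt{t}\,\frac{h(\hat\theta_1([nt])) - h(\hat\theta_2([nt]))}{\sqrt{\mathrm{Var}(h(\hat\theta_1([nt]))) + \mathrm{Var}(h(\hat\theta_2([nt])))}}$$

have the same distribution asymptotically. So we only need to prove that the
sequential statistic $B_t^*$ converges to a Brownian motion in distribution. Now

$$\begin{aligned}
h(\hat\theta_j) - h(\theta_j) &= (\hat\theta_j - \theta_j)(\partial h(\theta_j)/\partial \theta_j)' + o(\|\hat\theta_j - \theta_j\|^{1+\delta}) \\
&= (\hat\theta_j - \theta_j)(\partial h(\theta_j)/\partial \theta_j)' + o(n^{-1/2-\delta/3}) \quad \text{a.s.},\ j = 1, 2.
\end{aligned}$$

It is easy to see that

$$\mathrm{Var}[\hat\theta_j([nt])] = \mathrm{Var}[\hat\theta_j(n)]/t + o(n^{-1}) \quad \text{a.s.},\ j = 1, 2.$$

Here, we define

$$C = \sqrt{\mathrm{Var}[h(\hat\theta_1([nt]))] + \mathrm{Var}[h(\hat\theta_2([nt]))]}\ \sqrt{\mathrm{Var}[h(\hat\theta_1([ns]))] + \mathrm{Var}[h(\hat\theta_2([ns]))]}$$

and

$$D = (\partial h(\theta_1)/\partial \theta_1,\ -\partial h(\theta_2)/\partial \theta_2).$$

Then

$$\begin{aligned}
B_t^*(\hat\theta([nt])) &= \sqrt{t}\,\frac{h(\hat\theta_1([nt])) - h(\hat\theta_2([nt]))}{\sqrt{\mathrm{Var}(h(\hat\theta_1([nt]))) + \mathrm{Var}(h(\hat\theta_2([nt])))}} \\
&= \sqrt{t}\,\frac{h(\theta_1) - h(\theta_2) + (\hat\theta([nt]) - \theta)D' + o(n^{-1/2-\delta/3})}{\sqrt{\mathrm{Var}(h(\hat\theta_1([nt]))) + \mathrm{Var}(h(\hat\theta_2([nt])))}}.
\end{aligned}$$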


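By the conclusion of Lemma A.1,

$$n^{-1/2}([nt])(\hat\theta([nt]) - \theta) \to W_{2t}V^{1/2}$$

in distribution in the space $D_{[0,1]}$ with the Skorohod topology. Under
$H_0$, we have

$$B_t^*(\hat\theta([nt])) = \sqrt{t}\,\frac{(\hat\theta([nt]) - \theta)D' + o(n^{-1/2-\delta/3})}{\sqrt{\mathrm{Var}(h(\hat\theta_1([nt]))) + \mathrm{Var}(h(\hat\theta_2([nt])))}}$$

almost surely. So the sequential statistic $B_t^*$ converges to a Gaussian
process in distribution. In order to prove that $B_t^*$ converges to a Brownian
process in distribution, it is enough to show $EB_t^* \to 0$ and, for any
$0 < s < t < 1$,

$$\begin{aligned}
\mathrm{cov}(B_t^*, B_s^*) &= \frac{\sqrt{ts}\ \mathrm{cov}[\hat\theta([nt])D',\ \hat\theta([ns])D']}{C} + o(n^{-\delta/3}) \\
&= \frac{n\sqrt{t}\,s^{3/2}\,DVD'}{[nt][ns]\,C} + o(n^{-\delta/3}) \\
&= \frac{n\sqrt{t}\,s^{3/2}\,D\left(n\,\mathrm{Var}[\hat\theta(n) - \theta] + o(1)\right)D'}{[nt][ns]\,C} + o(n^{-\delta/3}) \\
&= \frac{n^2\sqrt{t}\,s^{3/2}\,(\partial h(\theta_1)/\partial\theta_1)\,\mathrm{Var}[\hat\theta_1(n) - \theta_1]\,(\partial h(\theta_1)/\partial\theta_1)'}{[nt][ns]\,C} \\
&\quad + \frac{n^2\sqrt{t}\,s^{3/2}\,(\partial h(\theta_2)/\partial\theta_2)\,\mathrm{Var}[\hat\theta_2(n) - \theta_2]\,(\partial h(\theta_2)/\partial\theta_2)'}{[nt][ns]\,C} + o(1) \\
&= \frac{n^2\sqrt{t}\,s^{3/2}\left(\mathrm{Var}[h(\hat\theta_1(n))] + \mathrm{Var}[h(\hat\theta_2(n))]\right)}{[nt][ns]\,C} + o(1) \\
&= \frac{n^2 t s^2}{[nt][ns]} + o(1) \\
&\to s \quad \text{a.s.}
\end{aligned}$$

It is easy to see that $EB_t^* \to 0$. This completes the proof and shows that
$B_t^*$ is asymptotically a Brownian process in distribution.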
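Under $H_1$, the sequential statistics

$$\begin{aligned}
B_t^*(\hat\theta([nt])) &= \sqrt{t}\,\frac{h(\hat\theta_1([nt])) - h(\hat\theta_2([nt])) - (h(\theta_1) - h(\theta_2))}{\sqrt{\mathrm{Var}(h(\hat\theta_1([nt]))) + \mathrm{Var}(h(\hat\theta_2([nt])))}} \\
&\quad + \sqrt{t}\,\frac{h(\theta_1) - h(\theta_2)}{\sqrt{\mathrm{Var}(h(\hat\theta_1([nt]))) + \mathrm{Var}(h(\hat\theta_2([nt])))}}.
\end{aligned}$$

With a similar proof, the first term converges to a standard Brownian motion
in distribution asymptotically. Because

$$[nt]\,\widehat{\mathrm{Var}}(h(\hat\theta_j([nt]))) = [nt]\,v_j\left(\frac{N([nt])}{[nt]},\ \hat\theta([nt])\right)(1 + o(1)) \quad \text{a.s.},\ j = 1, 2,$$

we have that

$$\frac{\sqrt{t}}{\sqrt{\widehat{\mathrm{Var}}(h(\hat\theta_1([nt]))) + \widehat{\mathrm{Var}}(h(\hat\theta_2([nt])))}}$$

converges to

$$\frac{\sqrt{t}}{\sqrt{v_1(\rho, \theta) + v_2(\rho, \theta)}}$$

in probability. Therefore, under $H_1$, by Slutsky's theorem, the sequential
statistic minus its drift term
$\sqrt{t}\,(h(\theta_1) - h(\theta_2))/\sqrt{v_1(\rho, \theta) + v_2(\rho, \theta)}$
converges to a standard Brownian motion asymptotically. □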

Acknowledgments. Special thanks go to the anonymous referees, the Associate
Editor and the Editor for their constructive comments, which led to a much
improved version of the paper.